Submodular Optimization Over Sliding Windows

Authors

  • Alessandro Epasto
  • Silvio Lattanzi
  • Sergei Vassilvitskii
  • Morteza Zadimoghaddam
Abstract

Maximizing submodular functions under cardinality constraints lies at the core of numerous data mining and machine learning applications, including data diversification, data summarization, and coverage problems. In this work, we study this question in the context of data streams, where elements arrive one at a time, and we want to design low-memory and fast update-time algorithms that maintain a good solution. Specifically, we focus on the sliding window model, where we are asked to maintain a solution that considers only the last W items. In this context, we provide the first non-trivial algorithm that maintains a provable approximation of the optimum using space sublinear in the size of the window. In particular, we give a (1/3 − ε)-approximation algorithm that uses space polylogarithmic in the spread of the values of the elements, Φ, and linear in the solution size k, for any constant ε > 0. At the same time, processing each element requires only a polylogarithmic number of evaluations of the function itself. When a better approximation is desired, we show a different algorithm that, at the cost of using more memory, provides a (1/2 − ε)-approximation and allows a tunable trade-off between average update time and space. This algorithm matches the best known approximation guarantees for submodular optimization in insertion-only streams, a less general formulation of the problem. We demonstrate the efficacy of the algorithms on a number of real-world datasets, showing that their practical performance far exceeds the theoretical bounds. The algorithms preserve high-quality solutions in streams with millions of items, while storing a negligible fraction of them.
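
To make the problem setting concrete, here is a minimal Python sketch of a naive baseline. It is not one of the algorithms from the paper. It stores the entire window of the last W items and reruns the classic greedy heuristic after every arrival, which is exactly the linear-in-W memory cost the abstract says the proposed algorithms avoid. The set-coverage objective and the names `coverage`, `greedy`, and `sliding_window_baseline` are illustrative assumptions.

```python
from collections import deque


def coverage(sets):
    """Illustrative monotone submodular objective: size of the union."""
    covered = set()
    for s in sets:
        covered |= s
    return len(covered)


def greedy(items, k, f=coverage):
    """Classic greedy: repeatedly add the item with the largest marginal gain."""
    chosen = []
    remaining = list(items)
    for _ in range(min(k, len(remaining))):
        base = f(chosen)
        best_gain, best_idx = 0, None
        for i, item in enumerate(remaining):
            gain = f(chosen + [item]) - base
            if gain > best_gain:
                best_gain, best_idx = gain, i
        if best_idx is None:  # no remaining item improves the objective
            break
        chosen.append(remaining.pop(best_idx))
    return chosen


def sliding_window_baseline(stream, W, k):
    """Naive baseline (an assumption, not the paper's method):
    keep all of the last W items and recompute greedy on every arrival."""
    window = deque(maxlen=W)  # the oldest item is dropped automatically
    for item in stream:
        window.append(item)
        yield greedy(window, k)


if __name__ == "__main__":
    stream = [frozenset({1, 2, 3}), frozenset({3, 4}),
              frozenset({5}), frozenset({1, 5, 6, 7})]
    for solution in sliding_window_baseline(stream, W=2, k=1):
        print(solution, coverage(solution))
```

This baseline uses memory proportional to W and evaluates the function on the order of W·k times per arriving item, so it only illustrates what "maintaining a solution over the last W items" means.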


Similar resources

Submodular Maximization over Sliding Windows

In this paper we study the extraction of representative elements in the data stream model in the form of submodular maximization. Different from the previous work on streaming submodular maximization, we are interested only in the recent data, and study the maximization problem over sliding windows. We provide a general reduction from the sliding window model to the standard streaming model, an...


Efficient Streaming Algorithms for Submodular Maximization with Multi-Knapsack Constraints

Submodular maximization (SM) has become a silver bullet for a broad class of applications such as influence maximization, data summarization, top-k representative queries, and recommendations. In this paper, we study the SM problem in data streams. Most existing algorithms for streaming SM only support the append-only model with cardinality constraints, which cannot meet the requirements of rea...


Querying Sliding Windows Over Online Data Streams

A data stream is a real-time, continuous, ordered sequence of items generated by sources such as sensor networks, Internet traffic flow, credit card transaction logs, and on-line financial tickers. Processing continuous queries over data streams introduces a number of research problems, one of which concerns evaluating queries over sliding windows defined on the inputs. In this paper, we descri...


On learning to localize objects with minimal supervision

Learning to localize objects with minimal supervision is an important problem in computer vision, since large fully annotated datasets are extremely costly to obtain. In this paper, we propose a new method that achieves this goal with only image-level labels of whether the objects are present or not. Our approach combines a discriminative submodular cover problem for automatically discovering a...


Static Optimization of Conjunctive Queries with Sliding Windows over Unbounded Streaming Information Sources

We study the problem of static optimization of conjunctive queries with sliding window joins over unbounded streaming information sources. While previous work has suggested focusing on maximizing the output rate of queries over streaming information sources, we show that in steady-state, for conjunctive queries with sliding windows over unbounded streams, all feasible plans have the same output...




Publication date: 2017